Abstract: We study the scaling behavior of coupled sparse
graph codes over the binary erasure channel. In particular, let
$2L + 1$ be the length of the coupled chain, let $M$ be the number of
variables in each of the $2L + 1$ local copies, let $\ell$ be the number
of iterations, let $P_b$ denote the bit error probability, and let
$\epsilon$ denote the channel parameter. We are interested in how these
quantities scale when we let the blocklength $(2L + 1)M$ tend to
infinity. Based on empirical evidence we show that the threshold
saturation phenomenon is rather stable with respect to the scaling
of the various parameters and we formulate some general rules
of thumb which can serve as a guide for the design of coding
systems based on coupled graphs.

I. Introduction 

Spatially coupled codes [1] provide an entirely new way of
approaching capacity. The basic phenomenon can be phrased
as follows: an ensemble constructed by coupling a chain
of $(2L + 1)$ regular $(l, r)$ low-density parity-check (LDPC)
ensembles, together with appropriate boundary conditions of
the chain, exhibits a belief propagation (BP) threshold close
to the maximum-a-posteriori (MAP) threshold of the regular
$(l, r)$ ensemble. This phenomenon is known as threshold
saturation and it has been proved rigorously for the binary erasure
channel (BEC) in [1]. It has also been observed empirically
for a variety of other channels and other graphical models in
[?]. Low-density parity-check convolutional (LDPCC)
ensembles, first introduced in [5], are the best known example
of spatially coupled codes. In [6], the authors reformulate
LDPCC ensembles in terms of protographs. The BP threshold
for these codes is computed using density evolution (DE)
in [?] and it is conjectured that they achieve capacity
universally across the set of binary-input memoryless
output-symmetric channels [1].

It is probably fair to state that by now the asymptotic
performance of spatially coupled LDPC codes is well understood.
However, much less is known about their scaling behavior
[1]. For instance, the DE analysis of LDPCC codes typically
assumes that $L$ is kept fixed while $M$ tends to infinity. But
does the threshold saturation phenomenon happen even if $L$
grows as a function of $M$? In this work, we analyze the
finite-length performance of LDPCC codes and we study how it
scales with the coupling dimensions $M$ and $L$. Our empirical

This work was supported by grant No. 200021-125347 of the Swiss
National Foundation and by the Spanish government (MEC
TEC2009-14504-C02-{01,02} and Consolider-Ingenio 2010 CSD2008-00010).



observations indicate that the threshold saturation phenomenon
happens even when $L$ grows considerably faster than $M$,
which indicates that it is very robust. From our simulation
results we synthesize some general design rules for these codes.
In particular, if the code length is bounded, how should we
choose $L$ and $M$ to have the best performance? And how does
this choice affect the decoder complexity (in terms of the
average number of iterations)? These questions, among others,
are of significant practical importance.

The study of the finite-length behavior of LDPCC codes is
augmented by analyzing their error floor [?]. In [1], [?], the
authors prove that the minimum distance of LDPCC codes
is a fraction of $M$. These studies concern "large" weight
codewords. We investigate the occurrence of constant-sized
codewords/stopping sets, and in particular their scaling. We
prove that the fraction of codes with no error floor is roughly
equal to $\exp(-cL/M^{l-2})$, where $c$ only depends on the rate
of the code. Hence, for sufficiently small ratios $L/M^{l-2}$, it is
very easy to expurgate the ensemble and to find codes with
linear minimum distance.

II. Convolutional-Like LDPC Ensembles. Basic Design Parameters

We define the LDPCC ensembles using protographs [?]. We
start from a collection of $(2L + 1)$ regular $(l, r = kl)$ LDPC
protographs with $k \in \mathbb{N}$ [9] and so that $l$ is odd, as shown
in Fig. 1 for $(l, r) = (3, 6)$ and $L = 9$. The regular
$(l, r = kl)$ code is referred to as the underlying code. The associated
protograph has $k$ variable nodes of degree $l$ so that, if $M$ is the
total number of variables per protograph, each variable node of
the protograph represents $M/k$ variables in total. For instance,
in Fig. 1 each variable node of the protograph represents $M/2$
variables. In the following, we say that the LDPCC graph has
$(2L + 1)$ sections, one per protograph in Fig. 1.

Let us now define the coupled protograph. This graph is
constructed by spatially coupling the protographs in Fig. 1:
each variable node is connected to its $\hat{l}$ check node neighbors
on the left and to its $\hat{l}$ check node neighbors on the right,
where $\hat{l} = (l - 1)/2$ [1]. The coupled protograph is terminated
by adding $\hat{l}$ extra check nodes on each side. This process
is illustrated in Fig. 2 for the case $(l, r, L) = (3, 6, 9)$.
In the termination procedure described, the check nodes of



Fig. 1. A chain of $(2L + 1)$ regular (3,6) non-interacting protographs for
$L = 9$.

Fig. 2. Coupled protograph created from a chain of $(2L + 1)$ regular (3,6)
protographs for $L = 9$.



lower degree on both sides provide better protection for the
connected variables. However, there is a price to be paid for
this extra protection: the rate is reduced with respect to the
rate of the underlying code. The ensemble has $n = M(2L + 1)$
variable nodes and $(2(L + \hat{l}) + 1)M/k$ check nodes and the
design rate is

$$R(l, r = kl, L) = \frac{k - 1}{k} - \frac{2\hat{l}}{k(2L + 1)}, \tag{1}$$

where the first term is the rate of the underlying code [1]. To
generate a sample of the LDPCC ensemble we now "lift" the
coupled protograph in the same manner as is done for
regular ensembles [9]: we make $M/k$ copies of the coupled
protograph and we connect them by picking a random
permutation for each "edge bundle". In the following we refer to
the ensemble as the $(l, r, L, M)$ (convolutional) ensemble.
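The node counts and the design rate above can be checked numerically. The following is a minimal sketch in Python; the function name `ensemble_size` and its argument order are illustrative choices, not notation from the paper. It assumes $l$ odd, $r = kl$ and $k$ dividing $M$, and it verifies that the closed form for the rate agrees with one minus the ratio of check nodes to variable nodes.

```python
from fractions import Fraction


def ensemble_size(l, r, L, M):
    """Return (variable nodes, check nodes, design rate) of an (l, r, L, M) ensemble.

    Hypothetical helper for illustration; assumes l odd, r = k*l and k | M.
    """
    if l % 2 == 0 or r % l != 0 or M % (r // l) != 0:
        raise ValueError("need l odd, r = k*l, and k dividing M")
    k = r // l
    l_hat = (l - 1) // 2                      # coupling reach on each side
    n_var = M * (2 * L + 1)                   # n = M(2L + 1)
    n_chk = (2 * (L + l_hat) + 1) * M // k    # includes the termination checks
    rate = Fraction(k - 1, k) - Fraction(2 * l_hat, k * (2 * L + 1))
    # The closed form must agree with 1 - (#checks / #variables).
    assert rate == 1 - Fraction(n_chk, n_var)
    return n_var, n_chk, rate


if __name__ == "__main__":
    print(ensemble_size(3, 6, 9, 1024))
```

For the $(3, 6)$ example with $L = 9$ the rate loss with respect to the underlying rate $1/2$ is $1/19$; it shrinks as $L$ grows.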

A. Asymptotic analysis of the LDPCC ensemble 

The performance of spatially coupled ensembles under BP
decoding as $M$ goes to infinity is analyzed in [3], [1] using
density evolution (DE) [7]. This allows us to compute the BP
threshold, which defines the limit of the decodable region.
Let us denote the threshold for the BEC by $\epsilon^{BP}(l, r, L)$. We
have

$$\lim_{\ell \to \infty} \lim_{M \to \infty} P_b^{\ell}(\epsilon, l, r, L, M) = 0, \qquad \epsilon < \epsilon^{BP}(l, r, L), \tag{2}$$

where $P_b^{\ell}(\epsilon, l, r, L, M)$ is the ensemble average bit error
probability after $\ell$ decoding rounds:

$$P_b^{\ell}(\epsilon, l, r, L, M) = \mathbb{E}_{\mathcal{C} \in (l, r, L, M)}\left[P_b^{\ell}(\epsilon, \mathcal{C})\right]. \tag{3}$$

Similarly, $P_B^{\ell}(\epsilon, l, r, L, M)$ denotes the ensemble average
block error probability. One of the key results of the
asymptotic analysis of coupled codes is that $\epsilon^{BP}(l, r, L)$ is "very
close" to $\epsilon^{MAP}(l, r)$, the MAP threshold of the underlying
regular ensemble [1].
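To make the role of DE concrete, the following Python sketch iterates the standard BEC density evolution recursion on the coupled protograph of Section II. It is an illustration under stated assumptions, not the procedure of the cited references: the function name `coupled_de`, the edge indexing and the stopping rule are ours. Variable section $i$ is assumed to be connected to the $l$ check positions $i, \dots, i + l - 1$ (positions shifted by $\hat{l}$), and each check position sees $k$ edges from every neighboring variable section.

```python
import math


def coupled_de(eps, l=3, k=2, L=20, max_iter=2000, tol=1e-10):
    """BEC density evolution on the coupled (l, r = k*l, L) protograph.

    Returns (iterations used, posterior erasure probability per section).
    Illustrative sketch: x[i][d] is the erasure probability of the message
    sent from variable section i to check position i + d, d = 0..l-1.
    """
    n_var = 2 * L + 1
    x = [[eps] * l for _ in range(n_var)]
    post = [eps] * n_var
    for it in range(1, max_iter + 1):
        # Check-node update: y[i][d] is the message from check i + d to section i.
        y = [[0.0] * l for _ in range(n_var)]
        for i in range(n_var):
            for d in range(l):
                c = i + d
                # k - 1 other edges from the same section ...
                not_erased = (1.0 - x[i][d]) ** (k - 1)
                # ... and k edges from every other section attached to check c.
                for j in range(max(0, c - l + 1), min(n_var - 1, c) + 1):
                    if j != i:
                        not_erased *= (1.0 - x[j][c - j]) ** k
                y[i][d] = 1.0 - not_erased
        # Variable-node update and posterior erasure probability.
        for i in range(n_var):
            post[i] = eps * math.prod(y[i])
            for d in range(l):
                others = math.prod(y[i][d2] for d2 in range(l) if d2 != d)
                x[i][d] = eps * others
        if max(post) < tol:
            return it, post
    return max_iter, post


if __name__ == "__main__":
    for eps in (0.42, 0.45, 0.55):
        iters, post = coupled_de(eps)
        print(eps, iters, max(post))
```

Running the sketch for a channel parameter between the BP and MAP thresholds of the underlying $(3, 6)$ ensemble shows the sections at the boundary converging first, which is the DE counterpart of the "very close" statement above.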



B. Finite-length scaling of LDPCC codes

Finite-length scaling investigates the relationship between
the performance, the code parameters, and the decoding
complexity. Any practical design of an LDPCC code starts from a
set of constraints on the code rate in (1), the code length, and
the number of decoding iterations with the goal of finding the
best choice of parameters. To first order, one might wonder for
which scaling of $L$ with respect to $M$ the threshold saturation
phenomenon occurs. More precisely, if $L = f(M)$, for what
functions $f(\cdot)$ does the limit

$$\lim_{\ell \to \infty} \lim_{M \to \infty} P_B^{\ell}(\epsilon, l, r, L = f(M), M) \tag{4}$$

converge to 0 for all $\epsilon < \epsilon^{BP}(l, r, L)$ as stated in (2)? In
Section IV we investigate this question by testing the code
performance for several scaling functions $f(M)$ and increasing
values of $M$.

III. Decoding complexity 

A practical implementation of a message-passing decoder
has to set the number of iterations, call it $\ell_{\min}$, which ensures
reliable decoding in most cases. To be precise, assume that
we have to design $\ell_{\min}$ so that the decoder succeeds with
probability higher than $\delta$. In [?], this task is addressed via
DE. We have empirically computed the ensemble average
distribution of the required number of iterations. It is defined
as follows [7] (Chapter 3, Section 22):

$$\phi(\ell, \epsilon, L, M) = P_B^{\ell - 1}(\epsilon, l, r, L, M) - P_B^{\ell}(\epsilon, l, r, L, M), \tag{5}$$

for $\ell \geq 1$. Note that the associated cumulative function

$$\Phi(\ell_0, \epsilon, L, M) = \sum_{\ell = 1}^{\ell_0} \phi(\ell, \epsilon, L, M) = P_B^{0}(\epsilon, l, r, L, M) - P_B^{\ell_0}(\epsilon, l, r, L, M) \approx 1 - P_B^{\ell_0}(\epsilon, l, r, L, M), \tag{6}$$

provides the probability of successful decoding after $\ell_0$
iterations. Therefore, $\ell_{\min}$ is chosen so that $\Phi(\ell_{\min}, \epsilon, L, M) > \delta$.
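The selection of the iteration budget can be sketched in a few lines of Python. The helper names below are hypothetical, and the input format (one iteration count per successfully decoded word, plus the total number of attempts) is an assumption about how simulation results are logged. The empirical distribution plays the role of the iteration distribution in (5) and its running sum that of the cumulative function in (6).

```python
from collections import Counter


def iteration_distribution(iters_needed, n_total):
    """Empirical distribution of the number of iterations until success.

    iters_needed lists the iteration count of every successful attempt;
    n_total also counts the attempts that never succeeded.
    """
    counts = Counter(iters_needed)
    return {ell: counts[ell] / n_total for ell in sorted(counts)}


def min_iterations(phi, delta):
    """Smallest iteration budget whose cumulative success probability exceeds delta.

    Returns None if the target cannot be met with the observed data.
    """
    cumulative = 0.0
    for ell in sorted(phi):
        cumulative += phi[ell]
        if cumulative > delta:
            return ell
    return None


if __name__ == "__main__":
    phi = iteration_distribution([3, 3, 4, 5, 5, 5, 7, 9], n_total=10)
    print(phi, min_iterations(phi, 0.5))
```

Returning `None` when the target is out of reach keeps the failure floor visible: no iteration budget can push the success probability above the fraction of attempts that eventually decode.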

It is clear that for $\epsilon < \epsilon^{BP}(l, r)$, $\ell_{\min}$ is essentially
independent of $L$ and has the same distribution as for the
regular $(l, r)$ code of length $M$. In this regime all
sections can be decoded at the same time and the effect of the
boundary condition vanishes once $L$ has become sufficiently
large. It is easy to give a coarse upper bound on how large $L$
has to be for this to be true. Fix the "gap" $\epsilon^{BP}(l, r) - \epsilon > 0$.
We can determine via DE the required number of iterations
for the $(l, r)$ ensemble to bring down the error probability to
a desired small value. Assume that $L$ is large compared to
this number of iterations. Then the effect of the boundary has
not reached the middle section of the code by the time it has
essentially decoded.

On the other hand, for $\epsilon > \epsilon^{BP}(l, r)$, we expect that
the number of required iterations scales linearly in $L$: the
"decoding wave" starts at the boundaries and moves at a
constant speed towards the middle [3]. This can be seen in
Fig. 3 where we plot the bit error rate (BER) measured in
Fig. 3. Bit error probability per code section during the decoding process
for $\ell = 5$ (a), $\ell = 30$ (b), $\ell = 70$ (c) and $\ell = 110$ (d) for a convolutional
code with $L = 20$, $M = 1024$ and $\epsilon = 0.44$.



each section of an $(l = 3, r = 6, L = 20, M = 1024)$ ensemble
after $\ell = 5$ (a), $\ell = 30$ (b), $\ell = 70$ (c) and $\ell = 110$
(d) iterations for a channel parameter of $\epsilon = 0.44$. In Fig.
4 we plot $\phi(\ell, \epsilon, L, M)$ for $L = 5, 10, 20$, $M = 256, 512$
and $\epsilon = 0.44$. We have averaged over 50 code samples. First
observe that the distribution moves to the right with $L$ and we
can see that the mean of the distributions scales linearly in $L$,
so the larger $L$, the more iterations we need. Further, as $M$
increases, the distribution concentrates around its mean. This
means that for large $M$ most instances decode with a number
of iterations which is close to the expected value. However,
the distributions are heavy-tailed: over a large interval the
curves are approximately straight lines in a log-log plot, which
indicates that over this range they follow a power law, i.e., they
have the form $\beta\,\ell^{-\alpha}$ for some non-negative constants $\alpha$ and
$\beta$. Operationally this means that, with non-negligible probability,
an instance takes many more iterations to decode than is typical.
The last two conclusions are quite similar to what can be observed for
standard LDPC ensembles, see [7]. One strategy to deal with
the linear increase of the decoding complexity for very large
chain lengths is the application of a windowed decoder [10].
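A power-law tail can be quantified by a straight-line fit in the log-log domain. The sketch below assumes the tail has the form $\beta\,\ell^{-\alpha}$ and estimates both constants by ordinary least squares; the function name `fit_power_law` and the choice of estimator are ours and are not taken from the paper. It should be applied only to the range of $\ell$ over which the measured curve looks straight.

```python
import math


def fit_power_law(ells, probs):
    """Least-squares fit of log(prob) = log(beta) - alpha * log(ell).

    Returns (alpha, beta). Illustrative helper; all inputs must be positive.
    """
    xs = [math.log(e) for e in ells]
    ys = [math.log(p) for p in probs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    alpha = -slope                              # decay exponent
    beta = math.exp(mean_y - slope * mean_x)    # prefactor
    return alpha, beta


if __name__ == "__main__":
    ells = [50, 100, 200, 400]
    probs = [3.0 * e ** -2.5 for e in ells]     # synthetic tail, not measured data
    print(fit_power_law(ells, probs))
```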

IV. Scaling Behavior 

Let us now investigate for what scalings $L = f(M)$ the
threshold saturation effect appears. We have run a large set
of simulations for various scaling functions $f(M)$, using a
regular $(l, r) = (3, 6)$ code to construct the
ensembles. The BP and MAP thresholds for this ensemble are,
respectively, $\epsilon^{BP}(l, r) \approx 0.4294$ and $\epsilon^{MAP}(l, r) \approx 0.4881$. The
BP decoder is run until all messages have converged (which
always happens for the BEC). We choose an $(l, r) = (3, 6)$
code to better illustrate the effect of the error floor in the
scaling since regular codes with larger degrees, e.g., a (5, 10)
regular ensemble, have much lower error floors. Our current
aim however is not to construct optimal codes but to illustrate
some typical effects. For each $(L, M)$ pair we only consider
Fig. 4. Distribution of the number of iterations required by BP for a regular
(3,6) code, for $L \in \{5, 10, 20\}$, $M \in \{512, 1024\}$ and $\epsilon = 0.465$.

Fig. 5. Bit error probability $P_b^{\infty}(\epsilon, l, r, L)$ for an $(l = 3, r = 6, L, M)$ code.
$L$ is fixed to 100.



one single sample, randomly taken from the ensemble, as
described in Section II. For each value of $\epsilon$ and fixed code, we
consider 10^ transmitted codewords. 
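The simulation setup can be sketched as follows in Python. This is a minimal illustration and not the authors' simulator: the function names, the indexing of nodes and the use of a peeling decoder (which on the BEC leaves the same residual erasures as BP run to convergence) are our assumptions. Since the code is linear and the channel symmetric, only erasure patterns need to be drawn.

```python
import random


def sample_code(l, k, L, M, rng):
    """Lift the coupled protograph; returns (n_var, checks).

    checks[c] lists the variables attached to check c. Each edge bundle
    (variable type, check position) gets its own random permutation.
    """
    m = M // k                              # number of protograph copies
    n_sec = 2 * L + 1
    checks = [[] for _ in range((n_sec + l - 1) * m)]
    for i in range(n_sec):                  # variable section
        for t in range(k):                  # variable-node type in the section
            for d in range(l):              # neighboring check position i + d
                perm = list(range(m))
                rng.shuffle(perm)
                for a in range(m):
                    checks[(i + d) * m + perm[a]].append((i * k + t) * m + a)
    return n_sec * k * m, checks


def peel(n_var, checks, erased):
    """Peeling erasure decoder; returns the set of variables left erased."""
    erased = set(erased)
    var_to_chk = [[] for _ in range(n_var)]
    for c, vs in enumerate(checks):
        for v in vs:
            var_to_chk[v].append(c)
    unknown = [sum(v in erased for v in vs) for vs in checks]
    ready = [c for c, u in enumerate(unknown) if u == 1]
    while ready:
        c = ready.pop()
        if unknown[c] != 1:                 # already resolved through another check
            continue
        v = next(v for v in checks[c] if v in erased)
        erased.discard(v)
        for c2 in var_to_chk[v]:
            unknown[c2] -= 1
            if unknown[c2] == 1:
                ready.append(c2)
    return erased


def simulate_ber(l, k, L, M, eps, n_words, seed=0):
    """Residual bit erasure rate of one code sample over n_words transmissions."""
    rng = random.Random(seed)
    n_var, checks = sample_code(l, k, L, M, rng)
    residual = 0
    for _ in range(n_words):
        erased = [v for v in range(n_var) if rng.random() < eps]
        residual += len(peel(n_var, checks, erased))
    return residual / (n_words * n_var)


if __name__ == "__main__":
    print(simulate_ber(3, 2, 10, 256, 0.44, 20))
```

The peeling formulation is chosen because its cost is linear in the number of edges per transmitted word, which keeps long chains affordable; it does not report the number of BP iterations, so it cannot replace a message-passing decoder for the experiments of Section III.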

A. Fixed L, Increasing M

Consider first the case of constant $f(M)$. Since this is
the regime used for the DE analysis, we know that the
limiting performance as $M \to \infty$ is given by (2). We can get a
negligible error probability as long as we are operating below
the threshold $\epsilon^{MAP}(l, r)$. In Fig. 5 we represent the bit error
probability when $L$ is fixed to 100. As expected, the curves
become steeper as we increase $M$. Note that the curves show
an error floor. As we discuss in more detail in Section V, this
error floor is due to the fact that the ratio $L/M^{l-2}$ is relatively
large for most of these cases.

B. Fixed M, Increasing L

Let us now look at the other extreme, i.e., we fix $M$ to some
constant $M_0 > 0$ and let $L$ grow. Clearly, in this regime we do
not expect to see the same threshold saturation phenomenon.
If we consider $P_b^{\ell=\infty}(\epsilon, l, r, L, M_0)$ and increase $L$, then
we expect this error probability to be monotonically increasing
in $L$, since the longer the chain the higher the chance that
the "decoding wave" gets stuck before reaching the middle.
In Fig. 6 we plot the error probability for the case $M_0 = 512$.
As expected, the error probability is indeed monotonically
increasing in $L$ and it seems to converge to a limiting curve.
The determination of this limiting curve is an interesting open
problem.
Fig. 6. Bit error probability $P_b^{\infty}(\epsilon, l, r, L)$ for an $(l = 3, r = 6, L, M)$ code.
$M$ is fixed to 256.

C. L as a general function of M 

Now that we have investigated the two limiting cases, it
is of interest to scale both $M$ and $L$ together. At what scaling
does the behavior change? In Fig. 7 and Fig. 8 we test the
scaling functions $L = M/2$ and $L = (M/2)^2$. In Fig. 8
we have included, in dashed lines, the asymptotic ensemble
error floor derived in Section V. For both scaling functions the
performance improves with $M$, although in Fig. 8 the speed of
improvement is slower. Indeed, it seems that in both cases the
asymptotic threshold still is $\epsilon^{MAP}(l, r)$. This illustrates that the
threshold saturation phenomenon is quite robust and general.
Due to the large $L/M^{l-2}$ values, we observe large error floor
levels in both cases. Finally, in Fig. 9 we plot the performance
of an extreme scenario, where $L$ scales exponentially with
$M$. The performance now worsens with $L$, similarly to the
case considered in Section IV-B. A back-of-the-envelope
calculation, proposed to us by Andrea Montanari, suggests
that an exponential scaling relationship is exactly the boundary:
for a subexponential growth of $L$ as a function of $M$ we
expect the threshold phenomenon to happen, whereas for
super-exponential growth we expect it not to occur.

Let us summarize. The threshold saturation phenomenon 
happens empirically over a very wide range of scalings of L 
with respect to M and far beyond what theory currently can 
predict. This is of comfort to the code designer in the field 
and a challenge for any theoretician. 
Fig. 7. Bit error probability $P_b^{\infty}(\epsilon, l, r, L)$ for an $(l = 3, r = 6, L, M)$ code.
$L$ is equal to $M/2$.

Fig. 8. Bit error probability $P_b^{\infty}(\epsilon, l, r, L)$ for an $(l = 3, r = 6, L, M)$ code.
$L$ is equal to $(M/2)^2$. In dashed lines, we represent the asymptotic ensemble
error floor in (9) for $M = 128$ and $M = 256$.

Fig. 9. Bit error probability $P_b^{\infty}(\epsilon, l, r, L)$ for an $(l = 3, r = 6, L, M)$ code.
$L$ scales exponentially with $M$.



V. Error Floor 

In many of the preceding simulations, we have seen the
occurrence of error floors. Let us now quickly discuss how
this error floor can be analyzed. To simplify matters, we
only consider codewords/stopping sets of weight two since
the contribution of codewords/stopping sets of higher weight
vanishes (in $M$) at a much higher rate.

Lemma 1 (Convergence to Poisson Distribution):
Consider an LDPCC ensemble $(l, r = kl, L, M)$. Let $\mathcal{C}$
be a code sample and $N_2$ be the number of codewords with
Hamming weight two in $\mathcal{C}$. Assume that the code is chosen
uniformly at random from the ensemble.
Then the distribution of $N_2$ converges (in $M$) to

$$N_2 \sim \text{Pois}(\lambda), \qquad \lambda = k^{l-2} \binom{k}{2} \frac{(2L + 1)}{M^{l-2}}, \qquad k \geq 2. \tag{7}$$



Proof: Note that a codeword of weight two is only formed
by two variables in the same section, see Fig. 2, that share the
same set of $l$ check nodes. Since there are $M/k$ check nodes
per section, this happens with probability $p = (M/k)^{-l}$. In
each section, we count $\binom{k}{2}(M/k)^2$ pairs of variables that can
form a weight-two codeword. Therefore, in a graph with
$(2L + 1)$ sections, the expected number of such codewords converges
to $\lambda = (2L + 1)\binom{k}{2}(M/k)^{2} p = k^{l-2}\binom{k}{2}(2L + 1)/M^{l-2}$.
That the distribution converges
to a Poisson distribution follows by standard arguments as in
the case of uncoupled LDPC ensembles, see [7]. ■

Corollary 1 (Fraction of Codes with No Small Codewords):
The fraction of codes in the $(l, r = kl, L, M)$ ensemble with
no codewords of weight 2 converges to $\exp(-\lambda)$.

Proof: The expected fraction of such codes is given by
$P(N_2 = 0)$, which is $\exp(-\lambda)$ by Lemma 1. ■
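The Poisson parameter and the resulting fraction of codes without weight-two codewords are easy to evaluate. The sketch below assumes the reading $\lambda = k^{l-2}\binom{k}{2}(2L + 1)/M^{l-2}$ of the Poisson mean in Lemma 1; the function names are illustrative.

```python
import math


def weight_two_rate(l, r, L, M):
    """Limiting mean number of weight-two codewords (the Poisson parameter)."""
    k = r // l
    return k ** (l - 2) * math.comb(k, 2) * (2 * L + 1) / M ** (l - 2)


def fraction_without_weight_two(l, r, L, M):
    """Limiting fraction of code samples with no weight-two codeword."""
    return math.exp(-weight_two_rate(l, r, L, M))


def poisson_pmf(lam, n):
    """P(N = n) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam ** n / math.factorial(n)


if __name__ == "__main__":
    lam = weight_two_rate(3, 6, 100, 128)
    print(lam, fraction_without_weight_two(3, 6, 100, 128))
    print([round(poisson_pmf(lam, n), 4) for n in range(8)])
```

For the $(3, 6)$ ensemble the expression reduces to $\lambda = 2(2L + 1)/M$, so keeping $L$ a small fraction of $M$ leaves most code samples free of weight-two codewords.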

The accuracy of Lemma 1 is illustrated in Fig. 10 where
we compare the Poisson distribution in (7) with some
experimental data, obtained by analyzing $10^{?}$ code samples. We
consider an $(l = 3, r = 6, L = 100, M = 128)$ ensemble and
we plot the experimental normalized histogram (o) along with
the Poisson distribution in (7) (*). We can see that both plots
fit almost perfectly.

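Corollary 2: The expected error floor of an $(l, r, L, M)$
ensemble is given by

$$P_b^{\infty}(\epsilon, l, r, L, M) \approx 2\epsilon^2 \binom{k}{2} \frac{k^{l-2}}{M^{l-1}}, \qquad \epsilon \ll \epsilon^{BP}(l, r, L). \tag{8}$$

Proof: In the error floor region, we compute the BER as
follows:

$$P_b^{\infty}(\epsilon, l, r, L, M) = \mathbb{E}_{\mathcal{C} \in (l, r, L, M)}\left[P_b^{\infty}(\epsilon, \mathcal{C})\right] \overset{(a)}{\approx} \mathbb{E}\left[\frac{2 N_2 \epsilon^2}{M(2L + 1)}\right] = \frac{2\lambda\epsilon^2}{M(2L + 1)} = 2\epsilon^2 \binom{k}{2} \frac{k^{l-2}}{M^{l-1}}, \tag{9}$$

where, in step (a), we have assumed that the error floor is due
to codewords of weight two. A given sample $\mathcal{C}$ has $N_2$
such codewords and, on average, $\epsilon^2 N_2$ of them are erased, each
contributing two bit errors out of $M(2L + 1)$ bits. ■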
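As a numerical check of the error floor estimate, the sketch below evaluates it in two ways: directly from the closed form $2\epsilon^2\binom{k}{2}k^{l-2}/M^{l-1}$, and from the Poisson mean $\lambda$ via $2\lambda\epsilon^2/(M(2L + 1))$. Both expressions are our reading of the corollary and its proof, and the function names are illustrative. The two routes agree and the result does not depend on $L$.

```python
import math


def error_floor(eps, l, r, M):
    """Closed-form error floor estimate: 2 eps^2 C(k,2) k^(l-2) / M^(l-1)."""
    k = r // l
    return 2 * eps ** 2 * math.comb(k, 2) * k ** (l - 2) / M ** (l - 1)


def error_floor_via_lambda(eps, l, r, L, M):
    """Same estimate from the mean number of weight-two codewords."""
    k = r // l
    lam = k ** (l - 2) * math.comb(k, 2) * (2 * L + 1) / M ** (l - 2)
    # Each erased weight-two codeword costs 2 bit errors out of M(2L+1) bits.
    return 2 * lam * eps ** 2 / (M * (2 * L + 1))


if __name__ == "__main__":
    for M, L in ((128, 4096), (256, 16384)):
        print(M, L, error_floor(0.4, 3, 6, M), error_floor_via_lambda(0.4, 3, 6, L, M))
```

For the $(3, 6)$ ensemble the estimate is $4\epsilon^2/M^2$, so doubling $M$ lowers the floor by a factor of four regardless of the chain length.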
From the above observations we can deduce the following.
For a particular scaling of $L = f(M)$, if $\lambda$ stays bounded
Fig. 10. Poisson distribution approximation for $N_2$ (*) and experimental
estimation of the pdf (o) for an ensemble $(l = 3, r = 6, L = 100, M = 128)$.



from above by a small constant or even tends to 0, then it
is easy to expurgate the ensemble and hence to avoid error
floors. This always happens if $L$ grows slower than $M^{l-2}$,
a condition which is easy to achieve in practice. In order
to illustrate the accuracy of the analytical error floor predictions
we have on purpose considered ensembles that are hard to
expurgate, i.e., for these ensembles most code samples have
an error floor, which is predicted by (8). For instance, in Fig. 8
we have plotted in dashed lines the error floor in (8) for the
cases $(M = 128, L = 4096)$ and $(M = 256, L = 16384)$,
where we can observe the accuracy of the estimate.